UrbanAccess Tel Aviv Demo
This notebook gives a brief overview of the main functionality of UrbanAccess. The examples use Israel's MOT GTFS data and OpenStreetMap (OSM) pedestrian network data to create an integrated transit and pedestrian network for Tel Aviv District, for use in Pandana network accessibility queries.
It is heavily based on a demo notebook from UrbanAccess on UDST: https://github.com/UDST/urbanaccess
Notes:
- GTFS feeds are constantly updated. The feeds in this notebook may change over time, which may lead to slightly different results.
Installation:
For UrbanAccess installation instructions see: https://udst.github.io/urbanaccess/installation.html
This notebook contains optional Pandana examples, which require Pandana to be installed. For instructions see: http://udst.github.io/pandana/installation.html
Note that only Python 2.7 is available right now, and installing Fiona on Windows is a bit of a hassle.
Outline:
- The settings object
- The feeds object and searching for GTFS feeds
- Downloading GTFS data
- Loading GTFS data into an UrbanAccess transit data object
- Creating a transit network
- Downloading OSM data
- Creating a pedestrian network
- Creating an integrated transit and pedestrian network
- Saving a network to disk
- Loading a network from disk
- Visualizing the network
- Adding average headways to network travel time
- Using an UrbanAccess network with Pandana
import pandas as pd
import pandana as pdna
from pandana.loaders import osm
import time
import urbanaccess as ua
from urbanaccess.config import settings
from urbanaccess.gtfsfeeds import feeds
from urbanaccess import gtfsfeeds
from urbanaccess.gtfs.gtfsfeeds_dataframe import gtfsfeeds_dfs
from urbanaccess.network import ua_network, load_network
%matplotlib inline
# Pandana currently uses deprecated parameters in matplotlib; this hides the warning until it is fixed
import warnings
import matplotlib.cbook
warnings.filterwarnings("ignore",category=matplotlib.cbook.mplDeprecation)
The settings object
The settings object is a global urbanaccess_config object that can be used to set default options in UrbanAccess. In general, these options do not need to be changed.
settings.to_dict()
For example, you can stop printing in notebooks and only print to console by setting:
settings.log_console = True
The feeds object
The GTFS feeds object is a global urbanaccess_gtfsfeeds object that allows you to save and manage information needed to download multiple GTFS feeds. This object is a dictionary of the names of GTFS feeds or agencies and the URLs to use to download the corresponding feeds.
feeds.to_dict()
Searching for GTFS feeds
You can use the search function to find feeds on the GTFS Data Exchange. (Note: the GTFS Data Exchange is no longer being maintained as of Summer 2016, so feeds there may be out of date.)
If you know of a GTFS feed located elsewhere or one that is more up to date, you can add additional feeds located at custom URLs by adding a dictionary with the key as the name of the service/agency and the value as the URL.
feeds.add_feed(add_dict={'israel': 'ftp://gtfs.mot.gov.il/israel-public-transportation.zip'})
Note the GTFS feed now in your feeds object, ready to download:
feeds.to_dict()
Downloading GTFS data
Use the download function to download all the feeds in your feeds object at once. If no parameters are specified the existing feeds object will be used to acquire the data.
By default, your data will be downloaded into the directory of this notebook in the folder: data
gtfsfeeds.download()
Load GTFS data into an UrbanAccess transit data object
Now that we have downloaded our data let's load our individual GTFS feeds (currently a series of text files stored on disk) into a combined network of Pandas DataFrames.
- You can specify one feed or multiple feeds that are inside a root folder using the gtfsfeed_path parameter. If you want to aggregate multiple transit networks together, all the GTFS feeds you want to aggregate must be inside of a single root folder.
- Turn on validation and set a bounding box with the remove_stops_outsidebbox parameter turned on to ensure all your GTFS feed data are within a specified area.
Let's specify a bounding box of coordinates for Tel Aviv District to subset the GTFS data to. You can generate a bounding box by going to http://boundingbox.klokantech.com/ and selecting the CSV format.
validation = True
verbose = True
# bbox for Tel Aviv District
bbox = (34.732918,31.988688,34.876007,32.202171)
remove_stops_outsidebbox = True
append_definitions = True
loaded_feeds = ua.gtfs.load.gtfsfeed_to_df(gtfsfeed_path=None,
validation=validation,
verbose=verbose,
bbox=bbox,
remove_stops_outsidebbox=remove_stops_outsidebbox,
append_definitions=append_definitions)
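The bounding box above is ordered (lng_min, lat_min, lng_max, lat_max), while the Pandana OSM query later in this notebook uses (lat_min, lng_min, lat_max, lng_max). A small helper can make the conversion explicit and catch reversed minimum and maximum values. This is only a sketch: to_pandana_bbox is a hypothetical function, not part of UrbanAccess or Pandana, and it cannot detect latitude and longitude that have been swapped with each other.

```python
def to_pandana_bbox(ua_bbox):
    """Reorder (lng_min, lat_min, lng_max, lat_max) to (lat_min, lng_min, lat_max, lng_max)."""
    lng_min, lat_min, lng_max, lat_max = ua_bbox
    # Reject boxes whose min/max are reversed or outside valid coordinate ranges.
    if not (-180 <= lng_min < lng_max <= 180):
        raise ValueError('longitudes out of order or out of range')
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError('latitudes out of order or out of range')
    return (lat_min, lng_min, lat_max, lng_max)


# Same coordinates as the Tel Aviv District bbox used in this notebook.
tel_aviv_bbox = (34.732918, 31.988688, 34.876007, 32.202171)
print(to_pandana_bbox(tel_aviv_bbox))
```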
The transit data object
The output is a global urbanaccess_gtfs_df object that can be accessed with the specified variable loaded_feeds. This object holds all the individual GTFS feed files aggregated together with each GTFS feed file type in separate Pandas DataFrames to represent all the loaded transit feeds in a metropolitan area.
loaded_feeds.stops.head()
Note that the transit services in the feed are aggregated into one regional table.
Quickly view the transit stop locations
loaded_feeds.stops.plot(kind='scatter', x='stop_lon', y='stop_lat', s=0.1)
loaded_feeds.agencies.head()
loaded_feeds.routes.head()
loaded_feeds.stop_times.head()
loaded_feeds.trips.head()
loaded_feeds.calendar.head()
Create a transit network
Now that we have loaded and standardized our GTFS data, let's create a travel time weighted graph from the GTFS feeds we have loaded.
Create a network for weekday Monday service between 7 am and 10 am (['07:00:00', '10:00:00']) to represent travel times during the AM Peak period.
Assumptions: We are using the service ids in the calendar file to subset the day of week. If your feed uses the calendar_dates file instead of the calendar file, you can use the calendar_dates_lookup parameter. It is not used here (calendar_dates_lookup=None).
ua.gtfs.network.create_transit_net(gtfsfeeds_dfs=loaded_feeds,
day='monday',
timerange=['07:00:00', '10:00:00'],
calendar_dates_lookup=None)
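To make the time window concrete, the following sketch filters stop times to the AM Peak range. It is a plain-Python illustration of the idea, not UrbanAccess's implementation: to_seconds, in_timerange, and the sample records are hypothetical, and treating both ends of the range as inclusive is an assumption. GTFS allows times past 24:00:00 for trips that run after midnight, so times are converted to seconds rather than parsed as clock times.

```python
def to_seconds(hms):
    """Convert a GTFS 'HH:MM:SS' string to seconds past the start of the service day."""
    hours, minutes, seconds = (int(part) for part in hms.split(':'))
    return hours * 3600 + minutes * 60 + seconds


def in_timerange(stop_times, timerange):
    """Keep records whose departure_time falls inside [start, end], inclusive."""
    start, end = (to_seconds(t) for t in timerange)
    return [row for row in stop_times
            if start <= to_seconds(row['departure_time']) <= end]


# Invented sample records; the last one is a trip running past midnight.
sample = [
    {'trip_id': 'a', 'departure_time': '06:55:00'},
    {'trip_id': 'b', 'departure_time': '07:00:00'},
    {'trip_id': 'c', 'departure_time': '09:59:30'},
    {'trip_id': 'd', 'departure_time': '25:10:00'},
]
am_peak = in_timerange(sample, ['07:00:00', '10:00:00'])
print([row['trip_id'] for row in am_peak])
```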
The UrbanAccess network object
The output is a global urbanaccess_network object. This object holds the resulting graph comprised of nodes and edges for the processed GTFS network data for services operating at the day and time you specified inside of transit_edges and transit_nodes.
Let's set the global network object to a variable called urbanaccess_net that we can then inspect:
urbanaccess_net = ua.network.ua_network
urbanaccess_net.transit_edges.head()
urbanaccess_net.transit_nodes.head()
urbanaccess_net.transit_nodes.plot(kind='scatter', x='x', y='y', s=0.1)
Download OSM data
Now let's download OpenStreetMap (OSM) pedestrian street network data to produce a graph network of nodes and edges for Tel Aviv District. We will use the same bounding box as before.
nodes, edges = ua.osm.load.ua_network_from_bbox(bbox=bbox,
remove_lcn=True)
Create a pedestrian network
Now that we have our pedestrian network data let's create a travel time weighted graph from the pedestrian network we have loaded and add it to our existing UrbanAccess network object. We will assume a pedestrian travels on average at 3 mph.
The resulting weighted network will be added to your UrbanAccess network object inside osm_nodes and osm_edges
ua.osm.network.create_osm_net(osm_edges=edges,
osm_nodes=nodes,
travel_speed_mph=3)
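As a rough check on the resulting edge weights, walking time in minutes can be derived from an edge's length and the assumed speed. The sketch below assumes edge lengths in meters and only illustrates the unit conversion; walk_minutes is a hypothetical helper, not an UrbanAccess function.

```python
METERS_PER_MILE = 1609.344


def walk_minutes(length_m, speed_mph=3.0):
    """Travel time in minutes to walk length_m meters at speed_mph."""
    if speed_mph <= 0:
        raise ValueError('speed_mph must be positive')
    miles = length_m / METERS_PER_MILE
    return miles / speed_mph * 60.0


# One mile at 3 mph, then a 400 m street segment.
print(round(walk_minutes(1609.344), 2))
print(round(walk_minutes(400), 2))
```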
Let's inspect the results which we can access inside of the existing urbanaccess_net variable:
urbanaccess_net.osm_nodes.head()
urbanaccess_net.osm_edges.head()
urbanaccess_net.osm_nodes.plot(kind='scatter', x='x', y='y', s=0.1)
Create an integrated transit and pedestrian network
Now let's integrate the two networks together. The resulting graph will be added to your existing UrbanAccess network object. After running this step, your network will be ready to be used with Pandana.
The resulting integrated network will be added to your UrbanAccess network object inside net_nodes and net_edges
ua.network.integrate_network(urbanaccess_network=urbanaccess_net,
headways=False)
Let's inspect the results which we can access inside of the existing urbanaccess_net variable:
urbanaccess_net.net_nodes.head()
urbanaccess_net.net_edges.head()
urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type'] == 'transit'].head()
Save the network to disk
You can save the final processed integrated network net_nodes and net_edges to disk inside of an HDF5 file. By default the file will be saved to the directory of this notebook in the folder data.
ua.network.save_network(urbanaccess_network=urbanaccess_net,
filename='final_net.h5',
overwrite_key = True)
Load saved network from disk
You can load an existing processed integrated network HDF5 file from disk into an UrbanAccess network object.
urbanaccess_net = ua.network.load_network(filename='final_net.h5')
Visualize the network
You can visualize the network you just created using basic UrbanAccess plot functions
Integrated network
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=1.1, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Integrated network by travel time
Use the col_colors function to color edges by travel time. In this case, the darker the red, the higher the travel time.
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Let's zoom in closer to downtown Tel-Aviv-Yafo using a new smaller extent bbox.
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=(34.747867,32.032876,34.784817,32.067477),
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Transit network
You can also slice the network by network type
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type']=='transit'],
bbox=None,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Pedestrian network
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges[urbanaccess_net.net_edges['net_type']=='walk'],
bbox=None,
fig_height=30, margin=0.02,
edge_color='#999999', edge_linewidth=1, edge_alpha=1,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Using an UrbanAccess network with Pandana
Pandana (Pandas Network Analysis) is a tool to compute network accessibility metrics.
Now that we have an integrated transit and pedestrian network that has been formatted for use with Pandana, we can use Pandana right away to compute accessibility metrics.
There are a couple of things to remember about UrbanAccess and Pandana:
- UrbanAccess generates by default a one way network. One way means there is an explicit edge for each direction in the edge table. Where applicable, it is important to set any Pandana twoway parameters to False (they are True by default) to indicate that the network is a one way network.
- As of Pandana v0.3.0, node ids and the from and to columns in your network must be integer type and not string. UrbanAccess automatically generates both string and integer types, so use the from_int and to_int columns in edges and the index id_int in nodes.
- UrbanAccess by default will generate edge weights that represent travel time in units of minutes.
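The first two points can be illustrated with a small sketch: expand a two way edge list into explicit one way edges, then map string node ids to integers. This is plain Python for illustration only; the function names and sample ids are hypothetical and do not reflect how UrbanAccess builds its tables.

```python
def to_one_way(edges):
    """Return an edge list with an explicit (from, to, weight) row per direction."""
    directed = []
    for from_id, to_id, weight in edges:
        directed.append((from_id, to_id, weight))
        directed.append((to_id, from_id, weight))
    return directed


def integer_ids(edges):
    """Map string node ids to consecutive integers, in order of first appearance."""
    id_map = {}
    for from_id, to_id, _ in edges:
        for node_id in (from_id, to_id):
            if node_id not in id_map:
                id_map[node_id] = len(id_map) + 1
    int_edges = [(id_map[f], id_map[t], w) for f, t, w in edges]
    return id_map, int_edges


# Invented sample: weights are travel times in minutes.
two_way = [('stop_a', 'osm_1', 2.5), ('osm_1', 'osm_2', 1.0)]
one_way = to_one_way(two_way)
id_map, int_edges = integer_ids(one_way)
print(int_edges)
```

A network built this way has one row per direction, which is why the Pandana network is initialized with twoway=False.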
For more on Pandana, see:
Pandana repo: https://github.com/UDST/pandana
Pandana documentation: http://udst.github.io/pandana/
Initialize the Pandana network
Let's initialize our Pandana network object using the transit and pedestrian network we created. Note the from_int and to_int columns, as well as twoway=False, which denotes that this is an explicit one way network.
s_time = time.time()
transit_ped_net = pdna.Network(urbanaccess_net.net_nodes["x"],
urbanaccess_net.net_nodes["y"],
urbanaccess_net.net_edges["from_int"],
urbanaccess_net.net_edges["to_int"],
urbanaccess_net.net_edges[["weight"]],
twoway=False)
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Amenity accessibility
# configure search at a max distance of 1 km for up to the 10 nearest points-of-interest
amenities = ['restaurant', 'bar', 'pub', 'ice_cream', 'cafe', 'food_court',
'cinema', 'arts_centre', 'community_centre', 'nightclub', 'social_centre', 'studio', 'theatre',
'school', 'college', 'kindergarten', 'library', 'university',
'atm', 'bank',
'clinic', 'dentist', 'doctors', 'hospital', 'pharmacy', 'veterinary']
distance = 1000
num_pois = 10
num_categories = len(amenities) + 1 #one for each amenity, plus one extra for all of them combined
pandana_bbox = (31.988688, 34.732918, 32.202171, 34.876007)
# configure filenames to save/load POI and network datasets
bbox_string = '_'.join([str(x) for x in pandana_bbox])
net_filename = 'data/network_{}.h5'.format(bbox_string)
poi_filename = 'data/pois_{}_{}.csv'.format(''.join([a[0] for a in amenities]), bbox_string)
poi_filename
# keyword arguments to pass for the matplotlib figure
bbox_aspect_ratio = (pandana_bbox[2] - pandana_bbox[0]) / (pandana_bbox[3] - pandana_bbox[1])
fig_kwargs = {'facecolor':'w',
'figsize':(10, 10 * bbox_aspect_ratio)}
# keyword arguments to pass for scatter plots
plot_kwargs = {'s':5,
'alpha':0.9,
'cmap':'viridis_r',
'edgecolor':'none'}
# network aggregation plots are the same as regular scatter plots, but without a reversed colormap
agg_plot_kwargs = plot_kwargs.copy()
agg_plot_kwargs['cmap'] = 'viridis'
# keyword arguments to pass for hex bin plots
hex_plot_kwargs = {'gridsize':60,
'alpha':0.9,
'cmap':'viridis_r',
'edgecolor':'none'}
# keyword arguments to pass to make the colorbar
cbar_kwargs = {}
# keyword arguments to pass to basemap
bmap_kwargs = {}
# color to make the background of the axis
bgcolor = 'k'
Download points of interest (POIs) and network data from OSM
First get the points of interest - either load an existing set for the specified amenities and bounding box from CSV, or get it from the OSM API.
import os
start_time = time.time()
if os.path.isfile(poi_filename):
    # if a points-of-interest file already exists, just load the dataset from that
    pois = pd.read_csv(poi_filename)
    method = 'loaded from CSV'
else:
    # otherwise, query the OSM API for the specified amenities within the bounding box
    osm_tags = '"amenity"~"{}"'.format('|'.join(amenities))
    pois = osm.node_query(pandana_bbox[0], pandana_bbox[1], pandana_bbox[2], pandana_bbox[3], tags=osm_tags)
    # using the '"amenity"~"school"' returns preschools etc, so drop any that aren't just 'school' then save to CSV
    pois = pois[pois['amenity'].isin(amenities)]
    pois.to_csv(poi_filename, index=False, encoding='utf-8')
    method = 'downloaded from OSM'
print('{:,} POIs {} in {:,.2f} seconds'.format(len(pois), method, time.time()-start_time))
pois[['amenity', 'name', 'lat', 'lon']].head()
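The extra isin filter in the download step matters because the tag query is a regular expression, which matches substrings. A short sketch with Python's re module shows the difference. The sample values are made up for illustration, and Python's regex engine stands in here for the one used by the OSM API.

```python
import re

# Hypothetical subset of requested amenity names.
wanted = ['school', 'bar', 'cafe']
pattern = re.compile('|'.join(wanted))

# Made-up values such as a regex-based tag query might return.
returned = ['school', 'preschool', 'bar', 'barbershop', 'cafe']

# A substring regex match keeps every value that contains one of the names.
regex_matches = [value for value in returned if pattern.search(value)]

# An exact membership test keeps only the values that were asked for.
exact_matches = [value for value in returned if value in wanted]

print(regex_matches)
print(exact_matches)
```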
# how many points of interest of each type of amenity did we retrieve?
pois['amenity'].value_counts()
pois['node_id'] = transit_ped_net.get_node_ids(pois['lon'], pois['lat'])
pois['node_id'].nunique()
pois[['node_id', 'amenity', 'name', 'lat', 'lon']].head()
pois_gr = pois.groupby('node_id').size().reset_index()
pois_gr.loc[:,0].sum()
transit_ped_net.set(pois_gr.node_id, variable = pois_gr.loc[:,0], name='amens')
s_time = time.time()
amens_45 = transit_ped_net.aggregate(45, type='sum', decay='linear', name='amens')
amens_30 = transit_ped_net.aggregate(30, type='sum', decay='linear', name='amens')
amens_15 = transit_ped_net.aggregate(15, type='sum', decay='linear', name='amens')
print('Took {:,.2f} seconds'.format(time.time() - s_time))
print(amens_45.head())
print(amens_30.head())
print(amens_15.head())
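To give a feel for what a linearly decayed sum means, the sketch below weights each amenity count by how close it is to the origin node, with the weight falling from 1 at zero minutes to 0 at the search radius. This is an illustration of the idea under that assumption, not Pandana's code; linear_decay_sum and the sample travel times are hypothetical.

```python
def linear_decay_sum(reachable, radius):
    """Sum counts weighted by (1 - travel_time / radius) for nodes within radius."""
    total = 0.0
    for travel_time, count in reachable:
        if 0 <= travel_time <= radius:
            total += count * (1.0 - travel_time / float(radius))
    return total


# (travel time in minutes, number of amenities at that node), invented values
reachable = [(0, 1), (7.5, 2), (15, 1), (20, 5)]
print(linear_decay_sum(reachable, 15))
print(linear_decay_sum(reachable, 30))
```

With a 15 minute radius the node 20 minutes away contributes nothing, while with a 30 minute radius it is included at a reduced weight, which is why the three aggregations above can differ substantially.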
Amenities accessible within 15 minutes
s_time = time.time()
transit_ped_net.plot(amens_15,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Amenities accessible within 30 minutes
s_time = time.time()
transit_ped_net.plot(amens_30,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Amenities accessible within 45 minutes
s_time = time.time()
transit_ped_net.plot(amens_45,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Add average headways to network travel time
Calculate route stop level headways
The network we have generated so far only contains pure travel times. UrbanAccess can calculate route stop level average headways and add them to the network. These are used as a proxy for passenger wait times at stops and stations. The route stop level average headways are added to the pedestrian to transit connector edges.
Let's calculate headways for the same AM Peak time period. Statistics on route stop level headways will be added to your GTFS transit data object inside of headways.
ua.gtfs.headways.headways(gtfsfeeds_df=loaded_feeds,
headway_timerange=['07:00:00','10:00:00'])
loaded_feeds.headways.head()
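The idea behind a route stop level headway can be sketched in a few lines: take the departures of one route at one stop within the time window, and average the gaps between consecutive departures. The helper below is a hypothetical illustration, not the UrbanAccess implementation, and the departure times are invented.

```python
def mean_headway_minutes(departures):
    """Average gap in minutes between consecutive 'HH:MM:SS' departures."""
    seconds = sorted(
        int(h) * 3600 + int(m) * 60 + int(s)
        for h, m, s in (d.split(':') for d in departures)
    )
    if len(seconds) < 2:
        return None  # a headway needs at least two departures
    gaps = [later - earlier for earlier, later in zip(seconds, seconds[1:])]
    return sum(gaps) / float(len(gaps)) / 60.0


# Departures of one hypothetical route at one stop during the AM Peak.
departures = ['07:00:00', '07:10:00', '07:30:00', '07:45:00']
print(mean_headway_minutes(departures))
```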
Add the route stop level average headways to your integrated network
Now that headways have been calculated and added to your GTFS transit feed object, you can use them to generate a new integrated network that incorporates the headways within the pedestrian to transit connector edge travel times.
ua.network.integrate_network(urbanaccess_network=urbanaccess_net,
headways=True,
urbanaccess_gtfsfeeds_df=loaded_feeds,
headway_statistic='mean')
Integrated network by travel time with average headways
edgecolor = ua.plot.col_colors(df=urbanaccess_net.net_edges, col='weight', cmap='gist_heat_r', num_bins=5)
ua.plot.plot_net(nodes=urbanaccess_net.net_nodes,
edges=urbanaccess_net.net_edges,
bbox=bbox,
fig_height=30, margin=0.02,
edge_color=edgecolor, edge_linewidth=1, edge_alpha=0.7,
node_color='black', node_size=0, node_alpha=1, node_edgecolor='none', node_zorder=3, nodes_only=False)
Pandana again
This time using headway times.
s_time = time.time()
transit_ped_net_headway = pdna.Network(urbanaccess_net.net_nodes["x"],
urbanaccess_net.net_nodes["y"],
urbanaccess_net.net_edges["from_int"],
urbanaccess_net.net_edges["to_int"],
urbanaccess_net.net_edges[["weight"]],
twoway=False)
print('Took {:,.2f} seconds'.format(time.time() - s_time))
pois['node_id'] = transit_ped_net_headway.get_node_ids(pois['lon'], pois['lat'])
pois_gr = pois.groupby('node_id').size().reset_index()
pois_gr.loc[:,0].sum()
transit_ped_net_headway.set(pois_gr.node_id, variable = pois_gr.loc[:,0], name='amens')
s_time = time.time()
amens_45_hw = transit_ped_net_headway.aggregate(45, type='sum', decay='linear', name='amens')
amens_30_hw = transit_ped_net_headway.aggregate(30, type='sum', decay='linear', name='amens')
amens_15_hw = transit_ped_net_headway.aggregate(15, type='sum', decay='linear', name='amens')
print('Took {:,.2f} seconds'.format(time.time() - s_time))
print(amens_45_hw.head())
print(amens_30_hw.head())
print(amens_15_hw.head())
Amenities accessible within 15 minutes
s_time = time.time()
transit_ped_net_headway.plot(amens_15_hw,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Amenities accessible within 30 minutes
s_time = time.time()
transit_ped_net_headway.plot(amens_30_hw,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))
Amenities accessible within 45 minutes
s_time = time.time()
transit_ped_net_headway.plot(amens_45_hw,
plot_type='scatter',
fig_kwargs={'figsize':[20,20]},
bmap_kwargs={'epsg':'4141','resolution':'h'},
plot_kwargs={'cmap':'gist_heat_r','s':4,'edgecolor':'none'})
print('Took {:,.2f} seconds'.format(time.time() - s_time))